Overview

Dataset statistics

Number of variables: 25
Number of observations: 32412
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 MiB
Average record size in memory: 200.0 B

Variable types

Numeric: 11
Categorical: 14

Alerts

arrival_date_year has constant value "2017" Constant
country has a high cardinality: 143 distinct values High cardinality
stays_in_weekend_nights is highly correlated with total_nights High correlation
stays_in_week_nights is highly correlated with total_nights High correlation
is_repeated_guest is highly correlated with previous_bookings_not_canceled High correlation
previous_bookings_not_canceled is highly correlated with is_repeated_guest High correlation
total_nights is highly correlated with stays_in_weekend_nights and 1 other field High correlation
previous_cancellations is highly correlated with previous_bookings_not_canceled High correlation
previous_bookings_not_canceled is highly correlated with previous_cancellations High correlation
is_canceled is highly correlated with arrival_date_year High correlation
children is highly correlated with arrival_date_year High correlation
is_repeated_guest is highly correlated with arrival_date_year High correlation
stays_in_weekend_nights is highly correlated with arrival_date_year High correlation
arrival_date_year is highly correlated with is_canceled and 11 other fields High correlation
adults is highly correlated with arrival_date_year High correlation
meal is highly correlated with arrival_date_year High correlation
reserved_room_type is highly correlated with arrival_date_year High correlation
distribution_channel is highly correlated with arrival_date_year High correlation
babies is highly correlated with arrival_date_year High correlation
customer_type is highly correlated with arrival_date_year High correlation
required_car_parking_spaces is highly correlated with arrival_date_year High correlation
arrival_date_month is highly correlated with arrival_date_year High correlation
id is highly correlated with is_canceled and 3 other fields High correlation
is_canceled is highly correlated with id High correlation
arrival_date_month is highly correlated with id and 1 other field High correlation
arrival_date_week_number is highly correlated with id and 1 other field High correlation
children is highly correlated with reserved_room_type High correlation
distribution_channel is highly correlated with is_repeated_guest High correlation
is_repeated_guest is highly correlated with id and 1 other field High correlation
reserved_room_type is highly correlated with children High correlation
previous_cancellations is highly skewed (γ1 = 23.76463483) Skewed
previous_bookings_not_canceled is highly skewed (γ1 = 23.46652365) Skewed
days_in_waiting_list is highly skewed (γ1 = 24.80627415) Skewed
id has unique values Unique
lead_time has 1376 (4.2%) zeros Zeros
stays_in_week_nights has 1937 (6.0%) zeros Zeros
previous_cancellations has 32186 (99.3%) zeros Zeros
previous_bookings_not_canceled has 31362 (96.8%) zeros Zeros
booking_changes has 27745 (85.6%) zeros Zeros
days_in_waiting_list has 32235 (99.5%) zeros Zeros
total_of_special_requests has 17338 (53.5%) zeros Zeros

Reproduction

Analysis started: 2022-07-07 13:02:35.936813
Analysis finished: 2022-07-07 13:03:04.308340
Duration: 28.37 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 32412
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60131.50518
Minimum: 6086
Maximum: 97903
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 6086
5-th percentile: 7726.55
Q1: 45291.75
Median: 53394.5
Q3: 89800.25
95-th percentile: 96282.45
Maximum: 97903
Range: 91817
Interquartile range (IQR): 44508.5

Descriptive statistics

Standard deviation: 29953.58618
Coefficient of variation (CV): 0.4981346481
Kurtosis: -1.310575271
Mean: 60131.50518
Median Absolute Deviation (MAD): 32382.5
Skewness: -0.2686934461
Sum: 1948982346
Variance: 897217324.9
Monotonicity: Strictly increasing
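
Several of the id figures above are simple functions of one another. The following is a minimal sketch, using only numbers copied from this report, of how the mean, range, IQR, coefficient of variation and variance relate; the variable names are illustrative and not part of the profiling tool.

```python
import math

# Figures copied from the id summary in this report.
n = 32412                      # Number of observations
total = 1948982346             # Sum
minimum, maximum = 6086, 97903
q1, q3 = 45291.75, 89800.25
std = 29953.58618              # Standard deviation

mean = total / n               # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                  # IQR = Q3 - Q1
cv = std / mean                # CV = Standard deviation / Mean
variance = std ** 2            # Variance = Standard deviation squared

# Compare against the reported values within a small relative tolerance,
# since the report rounds its figures to about ten significant digits.
print(math.isclose(mean, 60131.50518, rel_tol=1e-8))
print(value_range, iqr)
print(math.isclose(cv, 0.4981346481, rel_tol=1e-6))
print(math.isclose(variance, 897217324.9, rel_tol=1e-6))
```

The same relations should hold for the other numeric variables in the report, up to rounding.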
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
6147                  1      < 0.1%
23159                 1      < 0.1%
23422                 1      < 0.1%
87566                 1      < 0.1%
84860                 1      < 0.1%
95099                 1      < 0.1%
97146                 1      < 0.1%
91001                 1      < 0.1%
93048                 1      < 0.1%
7030                  1      < 0.1%
Other values (32402)  32402  > 99.9%

Value  Count  Frequency (%)
6086   1      < 0.1%
6087   1      < 0.1%
6088   1      < 0.1%
6089   1      < 0.1%
6090   1      < 0.1%
6091   1      < 0.1%
6092   1      < 0.1%
6093   1      < 0.1%
6094   1      < 0.1%
6095   1      < 0.1%

Value  Count  Frequency (%)
97903  1      < 0.1%
97902  1      < 0.1%
97901  1      < 0.1%
97900  1      < 0.1%
97899  1      < 0.1%
97898  1      < 0.1%
97897  1      < 0.1%
97896  1      < 0.1%
97895  1      < 0.1%
97894  1      < 0.1%

is_canceled
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 253.3 KiB

Value  Count
0      19821
1      12591

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters32412
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
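
As a rough illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the sample values are hypothetical; the sample only resembles values that appear in this report. The two-letter codes correspond to the labels used in the report: `Nd` is Decimal Number, `Po` is Other Punctuation, `Lu` is Uppercase Letter, `Ll` is Lowercase Letter and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in str(value):
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample resembling values shown in this report.
sample = ["0", "1", "2.0", "TA/TO"]
print(category_counts(sample))
```

A column holding only "0" and "1" therefore has a single distinct category, which matches the "Distinct categories" figure reported for is_canceled.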

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

Value  Count  Frequency (%)
0      19821  61.2%
1      12591  38.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
019821
61.2%
112591
38.8%

Most occurring characters

ValueCountFrequency (%)
019821
61.2%
112591
38.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number32412
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
019821
61.2%
112591
38.8%

Most occurring scripts

ValueCountFrequency (%)
Common32412
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
019821
61.2%
112591
38.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII32412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
019821
61.2%
112591
38.8%

lead_time
Real number (ℝ≥0)

ZEROS

Distinct: 368
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97.58786869
Minimum: 0
Maximum: 373
Zeros: 1376
Zeros (%): 4.2%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 23
Median: 76
Q3: 155
95-th percentile: 270
Maximum: 373
Range: 373
Interquartile range (IQR): 132

Descriptive statistics

Standard deviation: 86.50714564
Coefficient of variation (CV): 0.8864538882
Kurtosis: 0.02388213705
Mean: 97.58786869
Median Absolute Deviation (MAD): 61
Skewness: 0.8690844332
Sum: 3163018
Variance: 7483.486247
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   1376   4.2%
1                   783    2.4%
2                   492    1.5%
3                   403    1.2%
4                   391    1.2%
7                   377    1.2%
6                   366    1.1%
5                   345    1.1%
28                  313    1.0%
8                   290    0.9%
Other values (358)  27276  84.2%

Value  Count  Frequency (%)
0      1376   4.2%
1      783    2.4%
2      492    1.5%
3      403    1.2%
4      391    1.2%
5      345    1.1%
6      366    1.1%
7      377    1.2%
8      290    0.9%
9      236    0.7%

Value  Count  Frequency (%)
373    27     0.1%
372    11     < 0.1%
368    36     0.1%
367    2      < 0.1%
366    1      < 0.1%
365    2      < 0.1%
364    13     < 0.1%
361    2      < 0.1%
359    3      < 0.1%
358    3      < 0.1%

arrival_date_year
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
2017
32412 

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters129648
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2017
2nd row2017
3rd row2017
4th row2017
5th row2017

Common Values

ValueCountFrequency (%)
201732412
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
201732412
100.0%

Most occurring characters

ValueCountFrequency (%)
232412
25.0%
032412
25.0%
132412
25.0%
732412
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number129648
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
232412
25.0%
032412
25.0%
132412
25.0%
732412
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common129648
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
232412
25.0%
032412
25.0%
132412
25.0%
732412
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII129648
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
232412
25.0%
032412
25.0%
132412
25.0%
732412
25.0%

arrival_date_month
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
May
5262 
April
4878 
June
4580 
March
4277 
July
3626 
Other values (3)
9789 

Length

Max length8
Median length7
Mean length5.039954338
Min length3

Characters and Unicode

Total characters163355
Distinct characters19
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJanuary
2nd rowJanuary
3rd rowJanuary
4th rowJanuary
5th rowJanuary

Common Values

ValueCountFrequency (%)
May5262
16.2%
April4878
15.0%
June4580
14.1%
March4277
13.2%
July3626
11.2%
February3543
10.9%
January3150
9.7%
August3096
9.6%
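
The length figures reported for arrival_date_month can be recovered from the value counts above. This is a minimal sketch that recomputes the total character count and the mean length from the counts copied from this report; the dictionary name `month_counts` is illustrative.

```python
# Value counts copied from the Common Values table for arrival_date_month.
month_counts = {
    "May": 5262, "April": 4878, "June": 4580, "March": 4277,
    "July": 3626, "February": 3543, "January": 3150, "August": 3096,
}

n = sum(month_counts.values())  # number of observations
# Total characters = sum over categories of (label length * count).
total_chars = sum(len(month) * count for month, count in month_counts.items())
mean_length = total_chars / n

print(n, total_chars, round(mean_length, 6))
```

The result can be compared with the "Total characters" and "Mean length" entries listed for this variable.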

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
may5262
16.2%
april4878
15.0%
june4580
14.1%
march4277
13.2%
july3626
11.2%
february3543
10.9%
january3150
9.7%
august3096
9.6%

Most occurring characters

ValueCountFrequency (%)
u21091
12.9%
r19391
11.9%
a19382
11.9%
y15581
9.5%
J11356
 
7.0%
M9539
 
5.8%
l8504
 
5.2%
e8123
 
5.0%
A7974
 
4.9%
n7730
 
4.7%
Other values (9)34684
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter130943
80.2%
Uppercase Letter32412
 
19.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u21091
16.1%
r19391
14.8%
a19382
14.8%
y15581
11.9%
l8504
6.5%
e8123
 
6.2%
n7730
 
5.9%
i4878
 
3.7%
p4878
 
3.7%
c4277
 
3.3%
Other values (5)17108
13.1%
Uppercase Letter
ValueCountFrequency (%)
J11356
35.0%
M9539
29.4%
A7974
24.6%
F3543
 
10.9%

Most occurring scripts

ValueCountFrequency (%)
Latin163355
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u21091
12.9%
r19391
11.9%
a19382
11.9%
y15581
9.5%
J11356
 
7.0%
M9539
 
5.8%
l8504
 
5.2%
e8123
 
5.0%
A7974
 
4.9%
n7730
 
4.7%
Other values (9)34684
21.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII163355
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u21091
12.9%
r19391
11.9%
a19382
11.9%
y15581
9.5%
J11356
 
7.0%
M9539
 
5.8%
l8504
 
5.2%
e8123
 
5.0%
A7974
 
4.9%
n7730
 
4.7%
Other values (9)34684
21.2%

arrival_date_week_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct35
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean17.80405405
Minimum1
Maximum35
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size253.3 KiB

Quantile statistics

Minimum1
5-th percentile3
Q110
median18
Q325
95-th percentile33
Maximum35
Range34
Interquartile range (IQR)15

Descriptive statistics

Standard deviation9.17738444
Coefficient of variation (CV)0.5154659951
Kurtosis-0.9762359697
Mean17.80405405
Median Absolute Deviation (MAD)7
Skewness-0.001657087894
Sum577065
Variance84.22438516
MonotonicityNot monotonic
Histogram with fixed size bins (bins=35)
Value              Count  Frequency (%)
17                 1297   4.0%
18                 1271   3.9%
20                 1261   3.9%
15                 1218   3.8%
21                 1170   3.6%
22                 1161   3.6%
14                 1101   3.4%
23                 1098   3.4%
19                 1080   3.3%
8                  1054   3.3%
Other values (25)  20701  63.9%

Value  Count  Frequency (%)
1      703    2.2%
2      720    2.2%
3      731    2.3%
4      780    2.4%
5      683    2.1%
6      658    2.0%
7      989    3.1%
8      1054   3.3%
9      953    2.9%
10     962    3.0%

Value  Count  Frequency (%)
35     511    1.6%
34     625    1.9%
33     791    2.4%
32     643    2.0%
31     733    2.3%
30     783    2.4%
29     711    2.2%
28     952    2.9%
27     859    2.7%
26     1021   3.2%

arrival_date_day_of_month
Real number (ℝ≥0)

Distinct31
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.65694804
Minimum1
Maximum31
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size253.3 KiB

Quantile statistics

Minimum1
5-th percentile2
Q18
median15.5
Q323
95-th percentile29
Maximum31
Range30
Interquartile range (IQR)15

Descriptive statistics

Standard deviation8.766429465
Coefficient of variation (CV)0.5599066587
Kurtosis-1.187567447
Mean15.65694804
Median Absolute Deviation (MAD)7.5
Skewness0.003528410818
Sum507473
Variance76.85028556
MonotonicityNot monotonic
Histogram with fixed size bins (bins=31)
Value              Count  Frequency (%)
15                 1317   4.1%
9                  1227   3.8%
2                  1200   3.7%
28                 1139   3.5%
25                 1127   3.5%
3                  1121   3.5%
27                 1119   3.5%
24                 1116   3.4%
14                 1115   3.4%
19                 1111   3.4%
Other values (21)  20820  64.2%

Value  Count  Frequency (%)
1      1057   3.3%
2      1200   3.7%
3      1121   3.5%
4      923    2.8%
5      1099   3.4%
6      1076   3.3%
7      878    2.7%
8      1066   3.3%
9      1227   3.8%
10     1038   3.2%

Value  Count  Frequency (%)
31     564    1.7%
30     860    2.7%
29     916    2.8%
28     1139   3.5%
27     1119   3.5%
26     1110   3.4%
25     1127   3.5%
24     1116   3.4%
23     1059   3.3%
22     917    2.8%

stays_in_weekend_nights
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
0
13915 
2
9221 
1
9105 
3
 
101
4
 
70

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters32412
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row2
4th row2
5th row2

Common Values

ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

Most occurring characters

ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number32412
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common32412
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII32412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
013915
42.9%
29221
28.4%
19105
28.1%
3101
 
0.3%
470
 
0.2%

stays_in_week_nights
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.34009009
Minimum0
Maximum6
Zeros1937
Zeros (%)6.0%
Negative0
Negative (%)0.0%
Memory size253.3 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q33
95-th percentile5
Maximum6
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.375169854
Coefficient of variation (CV)0.5876568
Kurtosis-0.421009231
Mean2.34009009
Median Absolute Deviation (MAD)1
Skewness0.451786677
Sum75847
Variance1.891092127
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
2      9004   27.8%
1      8038   24.8%
3      7326   22.6%
4      2975   9.2%
5      2869   8.9%
0      1937   6.0%
6      263    0.8%

Value  Count  Frequency (%)
0      1937   6.0%
1      8038   24.8%
2      9004   27.8%
3      7326   22.6%
4      2975   9.2%
5      2869   8.9%
6      263    0.8%

Value  Count  Frequency (%)
6      263    0.8%
5      2869   8.9%
4      2975   9.2%
3      7326   22.6%
2      9004   27.8%
1      8038   24.8%
0      1937   6.0%

adults
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
2.0
24237 
1.0
6280 
3.0
 
1817
0.0
 
69
4.0
 
9

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters97236
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row2.0
3rd row2.0
4th row1.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.024237
74.8%
1.06280
 
19.4%
3.01817
 
5.6%
0.069
 
0.2%
4.09
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.024237
74.8%
1.06280
 
19.4%
3.01817
 
5.6%
0.069
 
0.2%
4.09
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
032481
33.4%
.32412
33.3%
224237
24.9%
16280
 
6.5%
31817
 
1.9%
49
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number64824
66.7%
Other Punctuation32412
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
032481
50.1%
224237
37.4%
16280
 
9.7%
31817
 
2.8%
49
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.32412
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common97236
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
032481
33.4%
.32412
33.3%
224237
24.9%
16280
 
6.5%
31817
 
1.9%
49
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII97236
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
032481
33.4%
.32412
33.3%
224237
24.9%
16280
 
6.5%
31817
 
1.9%
49
 
< 0.1%

children
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
0.0
30360 
1.0
 
1394
2.0
 
653
3.0
 
5

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters97236
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.030360
93.7%
1.01394
 
4.3%
2.0653
 
2.0%
3.05
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.030360
93.7%
1.01394
 
4.3%
2.0653
 
2.0%
3.05
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
062772
64.6%
.32412
33.3%
11394
 
1.4%
2653
 
0.7%
35
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number64824
66.7%
Other Punctuation32412
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
062772
96.8%
11394
 
2.2%
2653
 
1.0%
35
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.32412
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common97236
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
062772
64.6%
.32412
33.3%
11394
 
1.4%
2653
 
0.7%
35
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII97236
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
062772
64.6%
.32412
33.3%
11394
 
1.4%
2653
 
0.7%
35
 
< 0.1%

babies
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
0.0
32237 
1.0
 
171
2.0
 
4

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters97236
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.032237
99.5%
1.0171
 
0.5%
2.04
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.032237
99.5%
1.0171
 
0.5%
2.04
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
064649
66.5%
.32412
33.3%
1171
 
0.2%
24
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number64824
66.7%
Other Punctuation32412
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
064649
99.7%
1171
 
0.3%
24
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.32412
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common97236
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
064649
66.5%
.32412
33.3%
1171
 
0.2%
24
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII97236
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
064649
66.5%
.32412
33.3%
1171
 
0.2%
24
 
< 0.1%

meal
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
BB
24684 
SC
5035 
HB
 
2399
SC
 
258
FB
 
36

Length

Max length9
Median length9
Mean length8.944279896
Min length2

Characters and Unicode

Total characters289902
Distinct characters6
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBB
2nd rowBB
3rd rowBB
4th rowBB
5th rowBB

Common Values

ValueCountFrequency (%)
BB 24684
76.2%
SC 5035
 
15.5%
HB 2399
 
7.4%
SC258
 
0.8%
FB 36
 
0.1%
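
The meal values appear to carry whitespace padding: "SC" is listed twice, the maximum length is 9 while the minimum is 2, and the report counts 225078 space characters for this variable. The sketch below is one possible way to normalise such labels with the standard library. It assumes the padding is 7 trailing spaces on every value except the 258 short "SC" rows; that assumption is consistent with the reported space count but is not confirmed by the report. The names `raw_counts` and `merge_stripped` are hypothetical.

```python
from collections import Counter

PAD = " " * 7  # assumed padding; the position (trailing) is a guess

# Counts copied from the Common Values table for meal.
raw_counts = {
    "BB" + PAD: 24684,
    "SC" + PAD: 5035,
    "HB" + PAD: 2399,
    "SC": 258,
    "FB" + PAD: 36,
}


def merge_stripped(counts):
    """Strip surrounding whitespace from labels and sum counts that collide."""
    merged = Counter()
    for label, count in counts.items():
        merged[label.strip()] += count
    return merged


merged = merge_stripped(raw_counts)
# Space characters implied by the assumed padding.
spaces = sum(label.count(" ") * count for label, count in raw_counts.items())
print(dict(merged), spaces)
```

After stripping, the two "SC" variants collapse into one category, which agrees with the combined "sc" count shown in the Category Frequency Plot for this variable.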

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bb24684
76.2%
sc5293
 
16.3%
hb2399
 
7.4%
fb36
 
0.1%

Most occurring characters

ValueCountFrequency (%)
225078
77.6%
B51803
 
17.9%
S5293
 
1.8%
C5293
 
1.8%
H2399
 
0.8%
F36
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator225078
77.6%
Uppercase Letter64824
 
22.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B51803
79.9%
S5293
 
8.2%
C5293
 
8.2%
H2399
 
3.7%
F36
 
0.1%
Space Separator
ValueCountFrequency (%)
225078
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common225078
77.6%
Latin64824
 
22.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
B51803
79.9%
S5293
 
8.2%
C5293
 
8.2%
H2399
 
3.7%
F36
 
0.1%
Common
ValueCountFrequency (%)
225078
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII289902
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
225078
77.6%
B51803
 
17.9%
S5293
 
1.8%
C5293
 
1.8%
H2399
 
0.8%
F36
 
< 0.1%

country
Categorical

HIGH CARDINALITY

Distinct143
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
PRT
9887 
GBR
3927 
FRA
3477 
DEU
2378 
ESP
1932 
Other values (138)
10811 

Length

Max length3
Median length3
Mean length2.985530051
Min length2

Characters and Unicode

Total characters96767
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)0.1%

Sample

1st rowPRT
2nd rowAUT
3rd rowAUT
4th rowPRT
5th rowBEL

Common Values

ValueCountFrequency (%)
PRT9887
30.5%
GBR3927
 
12.1%
FRA3477
 
10.7%
DEU2378
 
7.3%
ESP1932
 
6.0%
ITA1153
 
3.6%
IRL1060
 
3.3%
BEL882
 
2.7%
BRA881
 
2.7%
USA774
 
2.4%
Other values (133)6061
18.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
prt9887
30.5%
gbr3927
 
12.1%
fra3477
 
10.7%
deu2378
 
7.3%
esp1932
 
6.0%
ita1153
 
3.6%
irl1060
 
3.3%
bel882
 
2.7%
bra881
 
2.7%
usa774
 
2.4%
Other values (133)6061
18.7%

Most occurring characters

ValueCountFrequency (%)
R20621
21.3%
P12223
12.6%
T11651
12.0%
A7271
 
7.5%
E6275
 
6.5%
B5776
 
6.0%
U4465
 
4.6%
G4185
 
4.3%
S3795
 
3.9%
F3660
 
3.8%
Other values (16)16845
17.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter96767
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R20621
21.3%
P12223
12.6%
T11651
12.0%
A7271
 
7.5%
E6275
 
6.5%
B5776
 
6.0%
U4465
 
4.6%
G4185
 
4.3%
S3795
 
3.9%
F3660
 
3.8%
Other values (16)16845
17.4%

Most occurring scripts

ValueCountFrequency (%)
Latin96767
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R20621
21.3%
P12223
12.6%
T11651
12.0%
A7271
 
7.5%
E6275
 
6.5%
B5776
 
6.0%
U4465
 
4.6%
G4185
 
4.3%
S3795
 
3.9%
F3660
 
3.8%
Other values (16)16845
17.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII96767
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R20621
21.3%
P12223
12.6%
T11651
12.0%
A7271
 
7.5%
E6275
 
6.5%
B5776
 
6.0%
U4465
 
4.6%
G4185
 
4.3%
S3795
 
3.9%
F3660
 
3.8%
Other values (16)16845
17.4%

distribution_channel
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
TA/TO
27083 
Direct
3642 
Corporate
 
1602
GDS
 
85

Length

Max length9
Median length5
Mean length5.304825373
Min length3

Characters and Unicode

Total characters171940
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTA/TO
2nd rowTA/TO
3rd rowTA/TO
4th rowTA/TO
5th rowTA/TO

Common Values

ValueCountFrequency (%)
TA/TO27083
83.6%
Direct3642
 
11.2%
Corporate1602
 
4.9%
GDS85
 
0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
ta/to27083
83.6%
direct3642
 
11.2%
corporate1602
 
4.9%
gds85
 
0.3%

Most occurring characters

ValueCountFrequency (%)
T54166
31.5%
A27083
15.8%
/27083
15.8%
O27083
15.8%
r6846
 
4.0%
e5244
 
3.0%
t5244
 
3.0%
D3727
 
2.2%
i3642
 
2.1%
c3642
 
2.1%
Other values (6)8180
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter113831
66.2%
Lowercase Letter31026
 
18.0%
Other Punctuation27083
 
15.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r6846
22.1%
e5244
16.9%
t5244
16.9%
i3642
11.7%
c3642
11.7%
o3204
10.3%
p1602
 
5.2%
a1602
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
T54166
47.6%
A27083
23.8%
O27083
23.8%
D3727
 
3.3%
C1602
 
1.4%
G85
 
0.1%
S85
 
0.1%
Other Punctuation
ValueCountFrequency (%)
/27083
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin144857
84.2%
Common27083
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
T54166
37.4%
A27083
18.7%
O27083
18.7%
r6846
 
4.7%
e5244
 
3.6%
t5244
 
3.6%
D3727
 
2.6%
i3642
 
2.5%
c3642
 
2.5%
o3204
 
2.2%
Other values (5)4976
 
3.4%
Common
ValueCountFrequency (%)
/27083
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII171940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T54166
31.5%
A27083
15.8%
/27083
15.8%
O27083
15.8%
r6846
 
4.0%
e5244
 
3.0%
t5244
 
3.0%
D3727
 
2.2%
i3642
 
2.1%
c3642
 
2.1%
Other values (6)8180
 
4.8%

is_repeated_guest
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size253.3 KiB
0
31395 
1
 
1017

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters32412
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

Most occurring characters

ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number32412
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

Most occurring scripts

ValueCountFrequency (%)
Common32412
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII32412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
031395
96.9%
11017
 
3.1%

previous_cancellations
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.01160064174
Minimum: 0
Maximum: 6
Zeros: 32186
Zeros (%): 99.3%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1804726207
Coefficient of variation (CV): 15.55712389
Kurtosis: 681.4587898
Mean: 0.01160064174
Median Absolute Deviation (MAD): 0
Skewness: 23.76463483
Sum: 376
Variance: 0.03257036681
Monotonicity: Not monotonic
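
As a sanity check, several of the descriptive statistics for previous_cancellations can be recomputed from the variable's value counts. The sketch below assumes the report uses the sample variance (divisor n - 1, the pandas default) and defines the coefficient of variation as the standard deviation divided by the mean; the variable names are illustrative.

```python
import math

# Value -> count for previous_cancellations, read from the report's frequency table.
counts = {0: 32186, 1: 165, 2: 29, 3: 6, 4: 10, 5: 1, 6: 15}

n = sum(counts.values())                         # number of observations
total = sum(v * c for v, c in counts.items())    # corresponds to the "Sum" row
mean = total / n

# Sample variance with divisor n - 1 (an assumption matching the pandas default).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                  # coefficient of variation

print(n, total)
print(mean, variance, std, cv)
```

Under these assumptions the results agree with the reported Mean, Variance, Standard deviation and CV to the precision shown in the report.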
Histogram with fixed size bins (bins=7)

Value  Count  Frequency (%)
0      32186  99.3%
1      165    0.5%
2      29     0.1%
6      15     < 0.1%
4      10     < 0.1%
3      6      < 0.1%
5      1      < 0.1%

Value  Count  Frequency (%)
0      32186  99.3%
1      165    0.5%
2      29     0.1%
3      6      < 0.1%
4      10     < 0.1%
5      1      < 0.1%
6      15     < 0.1%

Value  Count  Frequency (%)
6      15     < 0.1%
5      1      < 0.1%
4      10     < 0.1%
3      6      < 0.1%
2      29     0.1%
1      165    0.5%
0      32186  99.3%

previous_bookings_not_canceled
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 46
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1714179933
Minimum: 0
Maximum: 72
Zeros: 31362
Zeros (%): 96.8%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.875170138
Coefficient of variation (CV): 10.93916748
Kurtosis: 722.4618089
Mean: 0.1714179933
Median Absolute Deviation (MAD): 0
Skewness: 23.46652365
Sum: 5556
Variance: 3.516263047
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)

Value              Count  Frequency (%)
0                  31362  96.8%
1                  424    1.3%
2                  161    0.5%
3                  87     0.3%
4                  59     0.2%
5                  46     0.1%
6                  38     0.1%
7                  34     0.1%
8                  24     0.1%
10                 22     0.1%
Other values (36)  155    0.5%

Value  Count  Frequency (%)
0      31362  96.8%
1      424    1.3%
2      161    0.5%
3      87     0.3%
4      59     0.2%
5      46     0.1%
6      38     0.1%
7      34     0.1%
8      24     0.1%
9      19     0.1%

Value  Count  Frequency (%)
72     1      < 0.1%
71     1      < 0.1%
70     1      < 0.1%
69     1      < 0.1%
68     1      < 0.1%
67     1      < 0.1%
66     1      < 0.1%
65     1      < 0.1%
64     1      < 0.1%
63     1      < 0.1%

reserved_room_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 253.3 KiB

Value             Count
A                 23471
D                 6123
E                 1644
F                 503
G                 278
Other values (2)  393

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 518592
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
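
The length and character counts for reserved_room_type suggest that every value is a single letter followed by 15 spaces: each value is 16 characters long, and Space Separator accounts for 93.8% of all characters. A minimal sketch of that arithmetic, plus one common way to strip such padding; the padded sample string is a hypothetical reconstruction, not a value copied from the dataset.

```python
n_rows = 32412        # number of observations, from the report overview
value_length = 16     # min, median and max length reported for this column

total_chars = n_rows * value_length
letters = n_rows                     # assuming exactly one letter per value
spaces = total_chars - letters       # everything else would be padding
space_share = 100 * spaces / total_chars

print(total_chars, spaces, space_share)

# Hypothetical padded value; str.strip() removes the surrounding whitespace.
raw = "A" + " " * 15
clean = raw.strip()
print(len(raw), repr(clean))
```

If the padding is confirmed in the raw data, stripping it before analysis would reduce each value to its one-letter code.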

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value  Count  Frequency (%)
A      23471  72.4%
D      6123   18.9%
E      1644   5.1%
F      503    1.6%
G      278    0.9%
C      201    0.6%
B      192    0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
a      23471  72.4%
d      6123   18.9%
e      1644   5.1%
f      503    1.6%
g      278    0.9%
c      201    0.6%
b      192    0.6%

Most occurring characters

Value    Count   Frequency (%)
(space)  486180  93.8%
A        23471   4.5%
D        6123    1.2%
E        1644    0.3%
F        503     0.1%
G        278     0.1%
C        201     < 0.1%
B        192     < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Space Separator   486180  93.8%
Uppercase Letter  32412   6.2%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
A      23471  72.4%
D      6123   18.9%
E      1644   5.1%
F      503    1.6%
G      278    0.9%
C      201    0.6%
B      192    0.6%

Space Separator
Value    Count   Frequency (%)
(space)  486180  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  486180  93.8%
Latin   32412   6.2%

Most frequent character per script

Latin
Value  Count  Frequency (%)
A      23471  72.4%
D      6123   18.9%
E      1644   5.1%
F      503    1.6%
G      278    0.9%
C      201    0.6%
B      192    0.6%

Common
Value    Count   Frequency (%)
(space)  486180  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  518592  100.0%

Most frequent character per block

ASCII
Value    Count   Frequency (%)
(space)  486180  93.8%
A        23471   4.5%
D        6123    1.2%
E        1644    0.3%
F        503     0.1%
G        278     0.1%
C        201     < 0.1%
B        192     < 0.1%

booking_changes
Real number (ℝ≥0)

ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2167407133
Minimum: 0
Maximum: 18
Zeros: 27745
Zeros (%): 85.6%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 18
Range: 18
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6405505718
Coefficient of variation (CV): 2.955377243
Kurtosis: 64.62703588
Mean: 0.2167407133
Median Absolute Deviation (MAD): 0
Skewness: 5.374420631
Sum: 7025
Variance: 0.410305035
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Value             Count  Frequency (%)
0                 27745  85.6%
1                 3065   9.5%
2                 1160   3.6%
3                 268    0.8%
4                 117    0.4%
5                 29     0.1%
6                 16     < 0.1%
7                 5      < 0.1%
16                1      < 0.1%
18                1      < 0.1%
Other values (5)  5      < 0.1%

Value  Count  Frequency (%)
0      27745  85.6%
1      3065   9.5%
2      1160   3.6%
3      268    0.8%
4      117    0.4%
5      29     0.1%
6      16     < 0.1%
7      5      < 0.1%
8      1      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
18     1      < 0.1%
16     1      < 0.1%
15     1      < 0.1%
14     1      < 0.1%
11     1      < 0.1%
10     1      < 0.1%
8      1      < 0.1%
7      5      < 0.1%
6      16     < 0.1%
5      29     0.1%

days_in_waiting_list
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 75
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2623411082
Minimum: 0
Maximum: 223
Zeros: 32235
Zeros (%): 99.5%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 223
Range: 223
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.733026518
Coefficient of variation (CV): 18.04149777
Kurtosis: 748.8413878
Mean: 0.2623411082
Median Absolute Deviation (MAD): 0
Skewness: 24.80627415
Sum: 8503
Variance: 22.40154002
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  32235  99.5%
59                 6      < 0.1%
71                 6      < 0.1%
60                 6      < 0.1%
25                 6      < 0.1%
4                  5      < 0.1%
14                 5      < 0.1%
5                  5      < 0.1%
46                 5      < 0.1%
28                 5      < 0.1%
Other values (65)  128    0.4%

Value  Count  Frequency (%)
0      32235  99.5%
1      3      < 0.1%
2      2      < 0.1%
4      5      < 0.1%
5      5      < 0.1%
6      4      < 0.1%
7      4      < 0.1%
8      1      < 0.1%
9      3      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
223    1      < 0.1%
185    2      < 0.1%
183    1      < 0.1%
175    1      < 0.1%
165    1      < 0.1%
154    2      < 0.1%
122    1      < 0.1%
121    1      < 0.1%
117    1      < 0.1%
113    4      < 0.1%

customer_type
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 253.3 KiB

Value            Count
Transient        27461
Transient-Party  4427
Contract         359
Group            165

Length

Max length: 15
Median length: 9
Mean length: 9.788072319
Min length: 5

Characters and Unicode

Total characters: 317251
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
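
To illustrate the kind of character properties referred to here, the sketch below uses Python's standard unicodedata module to tally the general category of each character in one customer_type value. The report's category names (Uppercase Letter, Lowercase Letter, Dash Punctuation) correspond to the two-letter codes Lu, Ll and Pd; how the report itself derives them is not shown in this text, so treat this as an illustration only.

```python
import unicodedata
from collections import Counter

value = "Transient-Party"   # one of the four customer_type categories

# unicodedata.category returns the two-letter general category of a character,
# e.g. "Lu" (uppercase letter), "Ll" (lowercase letter), "Pd" (dash punctuation).
categories = Counter(unicodedata.category(ch) for ch in value)

for code, count in sorted(categories.items()):
    print(code, count)
```

Summing such tallies over every row of the column would give per-category character counts of the kind reported for this variable.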

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Transient
2nd row: Transient
3rd row: Transient
4th row: Transient
5th row: Transient

Common Values

Value            Count  Frequency (%)
Transient        27461  84.7%
Transient-Party  4427   13.7%
Contract         359    1.1%
Group            165    0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value            Count  Frequency (%)
transient        27461  84.7%
transient-party  4427   13.7%
contract         359    1.1%
group            165    0.5%

Most occurring characters

Value             Count  Frequency (%)
n                 64135  20.2%
t                 37033  11.7%
r                 36839  11.6%
a                 36674  11.6%
T                 31888  10.1%
s                 31888  10.1%
i                 31888  10.1%
e                 31888  10.1%
y                 4427   1.4%
-                 4427   1.4%
Other values (7)  6164   1.9%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  275985  87.0%
Uppercase Letter  36839   11.6%
Dash Punctuation  4427    1.4%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
n                 64135  23.2%
t                 37033  13.4%
r                 36839  13.3%
a                 36674  13.3%
s                 31888  11.6%
i                 31888  11.6%
e                 31888  11.6%
y                 4427   1.6%
o                 524    0.2%
c                 359    0.1%
Other values (2)  330    0.1%

Uppercase Letter
Value  Count  Frequency (%)
T      31888  86.6%
P      4427   12.0%
C      359    1.0%
G      165    0.4%

Dash Punctuation
Value  Count  Frequency (%)
-      4427   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   312824  98.6%
Common  4427    1.4%

Most frequent character per script

Latin
Value             Count  Frequency (%)
n                 64135  20.5%
t                 37033  11.8%
r                 36839  11.8%
a                 36674  11.7%
T                 31888  10.2%
s                 31888  10.2%
i                 31888  10.2%
e                 31888  10.2%
y                 4427   1.4%
P                 4427   1.4%
Other values (6)  1737   0.6%

Common
Value  Count  Frequency (%)
-      4427   100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  317251  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
n                 64135  20.2%
t                 37033  11.7%
r                 36839  11.6%
a                 36674  11.6%
T                 31888  10.1%
s                 31888  10.1%
i                 31888  10.1%
e                 31888  10.1%
y                 4427   1.4%
-                 4427   1.4%
Other values (7)  6164   1.9%

required_car_parking_spaces
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 253.3 KiB

Value  Count
0      30935
1      1468
2      6
8      2
3      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 32412
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  32412  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  32412  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  32412  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      30935  95.4%
1      1468   4.5%
2      6      < 0.1%
8      2      < 0.1%
3      1      < 0.1%

total_of_special_requests
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6577810687
Minimum: 0
Maximum: 5
Zeros: 17338
Zeros (%): 53.5%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8343410618
Coefficient of variation (CV): 1.268417565
Kurtosis: 1.170586881
Mean: 0.6577810687
Median Absolute Deviation (MAD): 0
Skewness: 1.214905428
Sum: 21320
Variance: 0.6961250074
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
0      17338  53.5%
1      10037  31.0%
2      3988   12.3%
3      907    2.8%
4      124    0.4%
5      18     0.1%

Value  Count  Frequency (%)
0      17338  53.5%
1      10037  31.0%
2      3988   12.3%
3      907    2.8%
4      124    0.4%
5      18     0.1%

Value  Count  Frequency (%)
5      18     0.1%
4      124    0.4%
3      907    2.8%
2      3988   12.3%
1      10037  31.0%
0      17338  53.5%

total_nights
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.207978526
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 253.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 7
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.73868266
Coefficient of variation (CV): 0.5419870009
Kurtosis: 0.3797450506
Mean: 3.207978526
Median Absolute Deviation (MAD): 1
Skewness: 0.8445081871
Sum: 103977
Variance: 3.023017394
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
3      8395   25.9%
2      6760   20.9%
4      5828   18.0%
1      5465   16.9%
5      2409   7.4%
7      2284   7.0%
6      939    2.9%
8      215    0.7%
9      62     0.2%
10     55     0.2%

Value  Count  Frequency (%)
1      5465   16.9%
2      6760   20.9%
3      8395   25.9%
4      5828   18.0%
5      2409   7.4%
6      939    2.9%
7      2284   7.0%
8      215    0.7%
9      62     0.2%
10     55     0.2%

Value  Count  Frequency (%)
10     55     0.2%
9      62     0.2%
8      215    0.7%
7      2284   7.0%
6      939    2.9%
5      2409   7.4%
4      5828   18.0%
3      8395   25.9%
2      6760   20.9%
1      5465   16.9%

Interactions

[Interaction plots: 121 images, not reproduced in this text version]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
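
A minimal pure-Python sketch of the calculation just described: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. Tied values are given the average of their positions, which is an assumption about tie handling rather than something stated in this report; the helper names and the small input lists are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance over the product of standard deviations (the 1/n factors cancel)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y increases monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y))
```

Because only the ordering matters, a perfectly monotonic but nonlinear relationship such as the cubic one above still reaches the maximum value of ρ.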

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
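
The sketch below follows that definition directly and also illustrates the invariance under changes of location and scale. It is a pure-Python illustration with hypothetical data, not the code used to produce this report.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)
# Shifting and rescaling each variable (by a positive factor) leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_shifted)
```

Rescaling by a negative factor would flip the sign of r while keeping its magnitude.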

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
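
The pair-counting definition above can be sketched as follows. This is the simple variant often called tau-a; libraries such as SciPy default to tau-b, which adds an adjustment for ties, and which variant this report uses is not stated here. The function name and data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1      # both variables order the pair the same way
        elif s < 0:
            discordant += 1      # the variables order the pair in opposite ways
        # s == 0 means a tie in x or y; it counts toward neither total
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With five concordant pairs and one discordant pair out of six, τ here is (5 - 1) / 6.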

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
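
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table, following the correction commonly attributed to Bergsma (2013). It assumes the table has no empty rows or columns; the function name and example tables are hypothetical, and this is not necessarily the exact code behind the report.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical tables: exactly independent, and perfectly associated.
independent = [[10, 20], [20, 40]]
perfect = [[50, 0], [0, 50]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The two example tables sit at the ends of the range: 0 for independence and 1 for perfect association.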

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

First rows

(index),id,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,booking_changes,days_in_waiting_list,customer_type,required_car_parking_spaces,total_of_special_requests,total_nights
0,6086,1,74.0,2017,January,1,1,1,0,2.0,0.0,0.0,BB,PRT,TA/TO,0,0,0,A,0,0,Transient,0,0,1
1,6087,1,62.0,2017,January,1,1,2,2,2.0,0.0,0.0,BB,AUT,TA/TO,0,0,0,A,0,0,Transient,0,1,4
2,6088,1,62.0,2017,January,1,1,2,2,2.0,0.0,0.0,BB,AUT,TA/TO,0,0,0,A,0,0,Transient,0,1,4
3,6089,1,71.0,2017,January,1,1,2,2,1.0,0.0,0.0,BB,PRT,TA/TO,0,0,0,A,0,0,Transient,0,1,4
4,6090,1,172.0,2017,January,1,1,2,5,2.0,0.0,0.0,BB,BEL,TA/TO,0,0,0,A,0,0,Transient,0,0,7
5,6091,1,52.0,2017,January,1,1,2,5,1.0,0.0,0.0,BB,DEU,TA/TO,0,0,0,A,0,0,Transient,0,0,7
6,6092,1,143.0,2017,January,1,2,1,1,2.0,0.0,0.0,BB,BRA,Direct,0,0,0,A,1,0,Transient,0,1,2
7,6093,1,21.0,2017,January,1,2,1,3,2.0,0.0,0.0,BB,BRA,TA/TO,0,0,0,A,0,0,Transient,0,1,4
8,6094,1,89.0,2017,January,1,2,1,3,2.0,0.0,0.0,BB,GBR,TA/TO,0,0,0,E,0,0,Transient,0,0,4
9,6095,1,48.0,2017,January,1,2,1,4,2.0,0.0,0.0,BB,PRT,Direct,0,0,0,A,1,0,Transient,0,2,5

Last rows

(index),id,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,booking_changes,days_in_waiting_list,customer_type,required_car_parking_spaces,total_of_special_requests,total_nights
32402,97894,0,185.0,2017,August,35,30,1,4,2.0,0.0,0.0,SC,CHE,TA/TO,0,0,0,A,0,0,Transient,0,1,5
32403,97895,0,247.0,2017,August,35,31,1,3,2.0,0.0,0.0,BB,GBR,TA/TO,0,0,0,A,0,0,Transient,0,0,4
32404,97896,0,109.0,2017,August,35,31,1,3,2.0,0.0,0.0,BB,GBR,TA/TO,0,0,0,D,0,0,Transient,0,1,4
32405,97897,0,44.0,2017,August,35,31,1,3,2.0,0.0,0.0,SC,DEU,TA/TO,0,0,0,A,0,0,Transient,0,1,4
32406,97898,0,188.0,2017,August,35,31,2,3,2.0,0.0,0.0,BB,DEU,Direct,0,0,0,A,0,0,Transient,0,0,5
32407,97899,0,164.0,2017,August,35,31,2,4,2.0,0.0,0.0,BB,DEU,TA/TO,0,0,0,A,0,0,Transient,0,0,6
32408,97900,0,21.0,2017,August,35,30,2,5,2.0,0.0,0.0,BB,BEL,TA/TO,0,0,0,A,0,0,Transient,0,2,7
32409,97901,0,23.0,2017,August,35,30,2,5,2.0,0.0,0.0,BB,BEL,TA/TO,0,0,0,A,0,0,Transient,0,0,7
32410,97902,0,34.0,2017,August,35,31,2,5,2.0,0.0,0.0,BB,DEU,TA/TO,0,0,0,D,0,0,Transient,0,4,7
32411,97903,0,109.0,2017,August,35,31,2,5,2.0,0.0,0.0,BB,GBR,TA/TO,0,0,0,A,0,0,Transient,0,0,7